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A series of computer simulations using variants of a formal model of attention 
(Melara and Algom, 2003) probed the role of rejection positivity (RP), a slow-wave 
electroencephalographic (EEG) component, in the inhibitory control of distraction. 
Behavioral and EEG data were recorded as participants performed auditory selective 
attention tasks. Simulations that modulated processes of distractor inhibition accounted 
well for reaction-time (RT) performance, whereas those that modulated target excitation 
did not. A model that incorporated RP from actual EEG recordings in estimating distractor 
inhibition was superior in predicting changes in RT as a function of distractor salience 
across conditions. A model that additionally incorporated momentary fluctuations in EEG 
as the source of trial-to-trial variation in performance precisely predicted individual RTs 
within each condition. The results lend support to the linking proposition that RP controls 
the speed of responding to targets through the inhibitory control of distractors. 

Keywords: rejection positivity, selective attention, computational model, inhibitory control, distractor salience 



INTRODUCTION 

Identifying the neural mechanisms that enable flexible, goal- 
directed behavior has been a fundamental aim of cognitive neu- 
roscience (Kerns et al., 2004; Friston, 2009). Yet establishing 
firm links between neural activity and its presumed cognitive 
or behavioral consequences has proved challenging. One rea- 
son is the ethical and technical limits (at least in humans) to 
manipulating behavioral responses from presumed physiologi- 
cal antecedents, relegating to merely correlational many linking 
hypotheses regarding behavioral outcomes (Teller, 1984; Schall, 
2004). The ambiguity inherent in interpreting event-related 
potentials (ERPs), for example, hampers validation of linking 
propositions in cognitive tasks requiring selective attention (e.g., 
Hopf et al., 2004; Martinez et al, 2006; Muller et al, 2006). 
Here, ERP waves to attended signals typically evince greater volt- 
age negativity than ERP waves to unattended signals, revealed 
in an ERP difference component called Nd (see Naatanen, 1990, 
for a review). As behavioral performance worsens with increas- 
ing task difficulty the magnitude of Nd shrinks and its onset 
latency lags. Thus, a specific physiological response (Nd magni- 
tude) is associated with specific behavior outcomes (e.g., target 
reaction time). Yet here as elsewhere the psychological mean- 
ing of the neural activity is open to interpretation. Does Nd 
reflect increased attentional focus to targets? Does it reflect active 
inhibition of distractors? One goal of the present paper was to 
perform a formal computational analysis of ERPs during selec- 
tive attention to evaluate more carefully than hitherto several 
hypothesized links between physiological activity and attentional 
processing. 



REJECTION POSITIVITY 

Recent electrophysiological research has uncovered an ERP com- 
ponent specifically associated with the processing of distractors 
during tasks of selective attention (Mtinte et al., 2010; Mittag 
et al., 2013; cf. Power et al., 2012). The component — named 
rejection positivity (RP) — appears approximately 200 ms after 
distractor onset and can last 400 ms or more thereafter. Originally 
RP was identified in microscopic analyses of Nd: Relative to the 
ERP wave in a non-attention control condition, ERP waves to 
unattended stimuli in an attention task showed increased volt- 
age positivity (Alho et al, 1987; Berman et al., 1989; Michie et al., 
1990; Alain and Woods, 1994; Berman and Friedman, 1995). As 
depicted in Figure 1, rejection positivity thus served as the coun- 
terpoint to the processing negativity (PN) accompanying ERP 
waves to attended stimuli (Naatanen et al., 1978; Naatanen, 1982; 
Bidet-Caulet et al, 2010). Alho et al. (1987) offered the interpre- 
tation that the positivity reflected active suppression of rejected 
signals, hence the term rejection positivity. However, it also is pos- 
sible that the ERP waves elicited in the so-called non-attention 
control condition are biased, either positively or negatively, thus 
leaving room for alternative explanations of RP (e.g., Michie et al., 
1993; Alho etal, 1994). 

Melara et al. (2002) were able to narrow the interpretive field 
of RP using results from an attention-training paradigm, which 
obviated the need for a non-attention baseline. Here, participants 
underwent 3 weeks of training to exercise skills of either audi- 
tory discrimination or distractor suppression, before and after 
being tested in discrimination and selective attention tasks. Thus, 
each participant served as his or her own baseline, against which 
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FIGURE 1 | Graphic depiction of rejection positivity (RP) to distractors 
and processing negativity (PN) to targets in an auditory selective 
attention task (after Bidet-Caulet et al., 2010). Top panel: Grand 
averaged ERPs to attended (in green), control (i.e., non-attention, in black), 
and ignored (in red) stimuli reveal attention-induced separation of 
waveforms beginning with the P2 component, 200 ms after stimulus onset. 
Bottom panel: Difference of ignored vs. control waves (in red), depicting 
RP, and attended vs. control waves (in green), depicting PN. 



RP could be revealed as the product of suppression training. 
Melara et al. found that suppression training, but not discrim- 
ination training, enhanced RP to distractors (but not PN to 
targets) beginning 200 ms after distractor onset. They further 
demonstrated that RP closely co-varied with improved behavioral 
performance during selective attention: The greater the positivity 
to distractors, the faster, less biased, and more accurate partic- 
ipants were in identifying targets. Since targets and distractors 
never overlapped in time, the authors suggested that RP regulates 
the salience of distractors held in working memory in the face 
of responses to currently perceived targets. More recently, Melara 
et al. (2012) showed that the effects of suppression training on RP 
peak several months after the final training session. The results 
of these studies suggest that RP is linked to a participant's ability 
to actively inhibit recent memories of distractors during selective 
attention to targets. 

Other recent research has focused on the neural sources of 
RP. Bidet-Caulet et al. (2010) examined scalp topographies to 
attended and ignored ERP waves subtracted from non-attention 
baseline ERP waves (see Figure 1). The PN to attended tones had 
an earlier onset (150 ms) and a more anterior topography than 
the RP to ignored tones (200 ms). Nevertheless, the center-of- 
mass to both components was concentrated on frontal electrode 



sites. For RP, such activation may represent an executive con- 
trol signal sourced in the frontal cortex, yet aimed at dampening 
activity in sensory cortex. In keeping with this interpretation, 
Chait et al. (2010), using MEG, recently found that distractor 
tones inserted temporally between two comparison tones elicited 
a magnetic RP component in the supra-temporal auditory cortex 
beginning 150 ms after distractor onset. The authors concluded 
that RP activity here reflected the sensory consequences of frontal 
inhibitory control mechanisms (see also Melara et al., 2002; 
Theeuwes and Chen, 2005). 

MANIPULATING DISTRACTOR SALIENCE IN WORKING MEMORY 

In an investigation of monkeys trained to perform visual search, 
Ipata et al. (2006) reported slower and weaker activity to LIP neu- 
rons responsive to salient (popout) vs. non-salient distractors on 
days when the monkeys could successfully ignore the popout. 
Conversely, these LIP neurons were unusually active to popout 
distractors on days when the monkeys were unable to ignore 
them. The authors concluded that frontal control signals serve 
to suppress visual distractor activity on a parietal salience map 
(Gottlieb et al, 1998) thereby permitting easier target search (see 
also Bisley and Goldberg, 2003). 

ERP evidence in humans indicates that RP is modulated 
by the salience of distractors in working memory. Melara 
et al. (2005) investigated salience by manipulating dimensional 
imbalance — that is, the psychophysical change along the dis- 
tractor dimension relative to the target dimension (Melara 
and Mounts, 1993; Algom et al, 1996; Sabri et al., 2001). 
Here, targets and distractors never overlapped in time, ensur- 
ing that the influence of distractors on the processing of targets 
resided in memory. As the remembered salience of distrac- 
tors increased from low to medium to high (across separate 
blocks of trials), participants' behavioral performance to per- 
ceived targets progressively deteriorated: They responded more 
slowly and committed more misidentifications of targets. These 
behavioral changes across conditions were accompanied by a 
monotonic reduction in RP occurring 400 to 600 ms after 
distractor onset. The authors concluded that as the salience 
of remembered distractors sharpened it became steadily more 
difficult for participants to inhibit in working memory, that 
is, which consequently undermined their task performance to 
targets. 

As with many linking propositions, Melara et al.'s (2005) con- 
clusion linking RP to inhibitory control of working memory 
rests solely on correlational evidence, that is, the finding that 
differences in levels of salience were associated both with corre- 
sponding changes in neural activity (RP magnitude) and behavior 
(RT and accuracy). However, these associations themselves spring 
largely from the joint effects created across the different salience 
conditions. Thus, the authors' attribution of inhibitory control 
was merely inferred from group averages. An alternative inter- 
pretation is that RP gauges perceived stimulus salience but plays 
no causal part in managing attentional control. The goal of the 
present study was to explore more precisely the possible role of 
RP in the inhibitory control of distraction in memory using a 
computational model that permitted microscopic (trial-to-trial) 
analyses of the connection between physiology and behavior. 
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FIGURE 2 | Graphic depiction of the time course of activation on a 
filtering trial. Presentation of a target stimulus simultaneously triggered 
excitation of the target's perceptual representation and inhibition of 
memory representations of previous targets and distractors. The time to 
reach threshold is classification RT, modeled as the number of cycles t 
needed to satisfy the stopping rule, which was set in all simulations here to 
(i = 1.0. Imagine moving the vertical threshold line progressively from left 
to right until p = 1 .0. Where the line stops (fi = 1 .0) is the predicted RT on 
the trial. 



MODEL-BASED CONNECTIONS BETWEEN NEURAL ACTIVITIES AND 
BEHAVIORAL PERFORMANCE 

Model-based cognitive neuroscience represents a powerful new 
approach to evaluate linking hypotheses (Gold and Shadlen, 2000, 
2003; Forstmann et al., 2011). Here, cognitive constructs are for- 
malized mathematically, connecting brain activation computa- 
tionally with behavioral outcomes (e.g., Holroyd and Coles, 2002; 
Roitman and Shadlen, 2002; Wang, 2002; Sugrue et al, 2004; 
Purcell et al, 2010; van Maanen et al, 201 1). Purcell et al. (2010), 
for example, formalized the construct of (target and distractor) 
evidence accumulation, which mathematically linked spike activ- 
ity in monkey frontal eye fields (FEF) to saccadic latency during 
visual search. Variants on the model distinguished the form of evi- 
dence accumulation (e.g., perfect, leaky, gated, etc.); goodness of 
fits between observed and predicted RT distributions successfully 
discriminated among model variants. The analysis supported the 
authors' linking proposition that neurons in FEF accumulate per- 
ceptual evidence until threshold, triggering saccadic movement 
toward target or distractor locations. 

The current study formalizes inhibitory control in working 
memory using the computational model proposed by Melara and 
Algom (2003). As shown in Figure 2, here attentional selection 
is conceived as a dual process, concurrently involving excita- 
tion of the current target and inhibition of all trial-irrelevant 
information, including all previous distractors held in memory. 
Distractor salience (among other factors) influences the buildup 
of excitation to the current target and inhibition to distracting 
information. A decision variable connects target choice to a pre- 
defined stopping rule of the ratio of activations between target 
and non-target stimulus values (see Method for details). Thus, 
the speed of responding is intrinsically linked in the model to the 
rate of growth in excitatory and inhibitory activity. 

THE CURRENT MODELING STUDY 

In the original version of the model, Melara and Algom (2003) 
used free mathematical parameters to estimate the rates of exci- 
tatory and inhibitory activity. However, in the current study, we 
used a brain variant of the model to evaluate the linking propo- 
sition that RP is associated as salience grows with the loss of 
inhibitory activation to distractors. Here, recorded values of RP 
substituted for the free parameters in determining the rates of 
activation to distractors. Inhibitory activation thence contributes 
to the weight of evidence (Gold and Shadlen, 2007) the perceiver 
considers about how to respond to the target stimulus. In this way, 
the model connects RP to behavioral decisions through the con- 
struct of inhibitory activation. We asked whether an alternative 
model in which recorded values of PN were used to set excitatory 
activation of targets could equally predict the effects of distractor 
salience on behavioral performance. Thus, we ask, does RP or PN 
better explain the variance in behavioral effects from distractor 
salience? 

We sought in the current study to predict behavioral responses 
from physiological data at both the condition level and the 
individual-trial level. Our behavioral outcome variable was reac- 
tion time (RT). At the condition level, we used RP (and PN) to 
simulate overall behavioral performance in each condition. At 
the individual-trial level, we used momentary fluctuations in the 



low-frequency oscillations to distractors that give rise to RP (i.e., 
RP calculated from single trial EEG, hereafter, RP-noise) to simu- 
late the RT of each participant on each trial. Our modeling efforts 
rest on the assumption that RP-noise is not merely random varia- 
tion in the bioelectric signal, but is in fact meaningfully associated 
with stochastic behavioral processes. We evaluated this assump- 
tion by comparing model simulations that predict behavioral 
variation using RP-noise with those using random noise drawn 
from either Gaussian (hereafter, Gaussian-noise) or rectangular 
(hereafter, Rectangular-noise) distributions. Our final version of 
the model combined EEG-based estimates of inhibitory activity 
with RP-noise to examine how well a fully linked neurobehav- 
ioral model could predict the individual RTs on every trial of every 
participant in every condition. 

METHODS 
DATA SOURCE 

The empirical data used in our simulations were reported as 
Experiment 1 in Melara et al. (2005). Eleven participants were 
tested. Stimuli were rectangular- wave tones in the 1-kHz range, 
100 ms in duration ( 10 ms rise/fall) presented binaurally at 73 dB, 
digitized to 16 bits at a sampling rate of 48 kHz. The stimulus 
set, which appears in Table 1, was used to construct one base- 
line condition (single distractor) and three filtering conditions 
(multiple distractors) of the so-called Garner paradigm (Garner 
and Felfoldy, 1970; Garner, 1974; see Figure 3). Each condition 
included three targets of differing auditory frequency (962 Hz, 
1000 Hz, and 1040 Hz); targets were the same across the four 



Frontiers in Human Neuroscience 



www.frontiersin.org 



August 2014 | Volume 8 | Article 585 | 3 



Chen and Melara 



Computational model of inhibitory control 



Table 1 | Auditory frequencies (in Hz) in each of four conditions (Baseline, Filtering 1, Filtering 2, and Filtering 3). 



Condition 


Target 1 


Target 2 


Target 3 


Distractor 1 


Distractor 2 


Distractor 3 


Baseline 


962 


1000 


1040 


1020 


1020 


1020 


Filtering 1 


962 


1000 


1040 


982 


1020 


1060 


Filtering 2 


962 


1000 


1040 


954 


1020 


1090 


Filtering 3 


962 


1000 


1040 


929 


1020 


1120 




FIGURE 3 | Graphic depiction of the modified Garner paradigm used as 
the data source in the current modeling study. Each filtering task 
contained three distractors, with the middle distractor (D2) the same as in the 
baseline task (1020 Hz), but the pitch range (marked by D 1 and D3) increasing 



progressively from the low-imbalance filtering task (Filtering 1 : D1 = 982 Hz, 
D 3 = 1060 Hz) to the high-imbalance filtering task (Filtering 3: Dj = 929 Hz, 
D3 = 1120 Hz). Note: Filtering 2 (i.e., medium-imbalance task) is absent in the 
depiction. 



conditions. In the baseline condition, the distractor was always 
1020 Hz. In each filtering condition, the distractor set contained 
three tones, the middle one matching the baseline distractor. 
The low and high tones in the distractor set differed increas- 
ingly in within-channel discriminability (physical distance) across 
the three filtering conditions, as summarized in Table 1, creat- 
ing low (Filtering 1), medium (Filtering 2), and high (Filtering 
3) imbalance relative to the target set. Differences between tar- 
gets and distractors were defined by the timbre of the tones, 
targets having a duty cycle of 36%, distractors 18%. Each con- 
dition contained 300 trials, 150 target trials (50 trials per target 
stimulus, P = 0.17) and 150 distractor trials (50 trials per dis- 
tractor stimulus in filtering, P = 0.17, 150 in baseline, P = 0.50). 
Stimuli were selected at random from the target or distractor 
set; targets never appeared together in time with distractors. The 
onset-to-onset interval between any two stimuli ranged randomly 
between 1450 ms and 1600 ms in rectangular distribution. The 
task requirements were identical in each condition: identify each 
stimulus to be one of three targets by pressing one of three keys on 
a keyboard (designated low, medium, and high) with the index, 
middle, and ring fingers of the dominant hand, while ignoring 



and withholding responses to all distractors. Participants listened 
for the frequency differences among targets and the timbre differ- 
ences between targets and distractors using an interactive display 
presented before each block of trials. Each participant was asked 
to respond to targets as quickly as possible while maintaining 
accuracy. Response keys were counterbalanced across partici- 
pants. EEG recordings were time-locked to each stimulus (targets 
and distractors); behavioral performance (RT and accuracy) was 
measured to each target. We only simulated target trials involving 
a correct response (range from 317-568 trials across participants, 
with an average of 474 correct trials in any condition for each 
participant). 

The EEG was recorded from 13 scalp locations (Fz, F3, F4, Cz, 
C3, C4, Pz, P3, P4, T3, T4, T5, and T6 of the International 10-20 
System) with tin electrodes mounted in a stretch cap (Electro-Cap 
International). Only the Fz site was used in modeling. The elec- 
trodes were referenced to linked mastoids (LM and RM) off-line 
with the Fpz as the ground electrode. Impedance was main- 
tained below 2 kf2 across all sites. Blinks and other eye movements 
were monitored by electrooculogram (EOG) from two electrode 
montages, one on the infra- and supra-orbital ridges of the left 
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eye (VEOG), the other on the outer canthi of each eye (HEOG). 
EEG and EOG signals were analog filtered with a band pass from 
0.1 to 100 Hz (-3 dB cutoffs) and digitized at 250 Hz. Trials con- 
taining EEG or EOG activity exceeding 100 |xV was rejected as 
artifacts. The EEG was averaged for each condition from 100 ms 
before the stimulus to 700 ms post-stimulus, using the 100 ms 
pre-fixation period as baseline, with RP amplitude calculated as 
the mean voltage in the 400-600 ms post stimulus window at Fz 
(Figure 4). 

ACTIVATION FUNCTION TO THE CURRENTLY PRESENTED TARGET 

Our overall approach to theorizing and model fitting followed 
Melara and Algom (2003). As depicted in Figure 2, we assumed, 
as they did, that presentation of a target stimulus simultaneously 
triggers activation of the current target's representation, as well 
as representations of targets from previous trials, and represen- 
tations of distractors from previous trials. The perceived target 
on a trial receives excitatory processing with the activation value 
increasing until asymptote or until a decision is made: 

ACt(T - t) = [ l + exp(-slope Tl .t)]^ W 

where the inverse logistic function is raised to a scaling constant 
of 10, Ti is the stimulus value on a target trial (z = 1, 2 or 3, with 
one of three possible targets appearing on each target trial in each 
condition), slope-^ is the slope of activation for T;, and Act(T;,t) 
is the activation value for T; at time f. 

ACTIVATION FUNCTION TO NON-PRESENTED TARGETS ON TARGET 
TRIALS 

In simulating the target trials, each non-presented stimulus, 
including both distractor and non-presented targets, is inhibited 
from target onset. The inhibitory activation value to currently 
non-presented targets in the stimulus set (T;, j = 1. . . 3 and j / i) 
decreases until asymptote or until a decision is made: 

Act(Tj, t) = — ^- — ^ (2) 

' [1 + exp ( - slope x . *t)] 10 

where slope-^ is the slope of the activation function to currently 
non-presented target T; and Act(Tpt) is the activation value for 
Tj at time t. Unlike Melara and Algom (2003), we constrained 
the slope of Tj — a target value not currently presented — to equal 
the slope of the same target on trials in which it had actually 
been presented (i.e., T;). Thus, the activation functions in these 
two situations (i.e., perceived and remembered) differ only in 
direction of activation, that is, increasing (toward an asymptotic 
bound of +1) for perceived targets (Equation 1), reflecting an 
excitatory process, and decreasing (toward a bound of —1) for 
remembered targets (Equation 2), reflecting an inhibitory pro- 
cess. This simplifying assumption reduced free parameters while 
minimally affecting model fits. 

ACTIVATION FUNCTION TO DISTRACTORS ON TARGET TRIALS 

As in the case of non-presented targets, we modeled distrac- 
tor processing as inhibitory activation from the onset of each 
target trial. However, to better equate comparisons with target 
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FIGURE 4 | Grand-averaged waveforms (in AV) of ERPs from midline 


electrode sites (Fz, Cz, and Pz) elicited by distractors in each of the 


four conditions (baseline; low, medium, and high filtering) of the 


target-constant tasks in the source data. Increased distractor 


discriminability reduced RP: a frontal inhibitory slow-wave component 


(400-600 ms after stimulus onset). 
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processing, here we estimated a single inhibitory slope 1 to the 
three different distractors on each target trial, which elevated the 
inhibitory bound of activation to —3: 



Act(D s , t) 



[1 + exp ( — slope d * t)] 



10 



(3) 



where sloped is the average slope of inhibitory activation across 
the three distractors and Act(D s ,t) is the distractor activation 
value on a target trial at time t. 

TRANSFORMATION OF ACTIVATION INTO WEIGHT OF EVIDENCE 

The decision variable about how to respond to the presented tar- 
get is a weight of evidence derived from the log ratio of two types 
of activation: The ratio's numerator contains evidence favoring 
the perceived target, the denominator evidence against the per- 
ceived target which, in the current study, amounts to transformed 
activations of remembered (currently non-presented) targets and 
distractors. Suppose that target T; appears on the current trial. We 
assume that the representation for Tj is activated in the form of 
excitation, and that the representations for all the other stimuli in 
memory are inhibited. The momentary goal of the decision mech- 
anism is to determine whether the weight of evidence favoring T; 
meets a preset stopping rule fi at each time t. The goal is achieved 
by a decision variable that integrates the excitatory evidence for Tj 
with evidence against Tj supplied by inhibitory activation values 
from the other stimuli in the set: 



Evidence(T;, t) = In 



Act(Tj, t) + constant 



J2j Act(Tj, t) + Act(D s , t) + constant ' 
(j= 1...3, andj^i) (4) 

where Act(T;,t) is the activation value for the currently pre- 
sented target Tj at time t (calculated according to Equation 1), 
Act(Tj,t) is the activation value for each currently non-presented 
target T; (j = 1 . . . 3, and j j£ i) at time t (Equation 2), and 
Act(D s ,t) is the activation value for the currently non-presented 
distractors at time f (Equation 3). {constant = 6.0 for all simu- 
lations to scale evidence within the range of the stopping rule). 
Expanding the evidence function into its three different terms 
(i.e., Equations 1-3): 



Evidence(Tj, t) = 



[n- 



1 



[1 + exp ( — slope T . * t)] 1 



+ 6 



£j n. 



+ 6 



[1 + exp( - slope T . * t)] 10 [1 + exp ( - slope d * t)] 10 
(j = 1... 3, andj £ i) (5) 

1 It is possible that some distractors were attentionally more disruptive 
than others. Yet in the original empirical experiment (Melara et al., 2005, 
Experiment 1), the main source of performance errors was within-target 
misidentifications (e.g., confusing a high-pitched target with a middle-pitched 
target), not errors of commission (falsely identifying a distractor as a tar- 
get). The false alarm rate was less than 5% across conditions (though slightly 
higher, 8%, in Filtering 3), and did not differ by distractor stimulus within 
condition. By contrast, target misidentifications were over twice this rate. 
For this reason, and also to reduce modeling complexity, in each filtering 
condition we simulated the joint effects of the three distractors. 



A decision is made and a response emitted once Evidence (T;,t) 
reaches the decision threshold f3. The time it takes to reach the 
threshold is the classification RT, modeled in the present study 
as the number of cycles t (repeated iterations of Equation 5 with 
increasing values of t ) needed to reach the decision threshold. For 
all simulations in the present study, the stopping rule was set to 
p = 1.0 (see Figure 2). 

MODEL-FITTING PROCEDURE 

For each simulated target trial j four slopes (slopeji, slopex2; 
slopex3, and sloped) are needed to calculate the decision vari- 
able for each participant (see Equation 5). The equations used 
to obtain these slopes were modifications of those used by Melara 
andAlgom (2003): 



slope x .[j] = - 



1 



slope d [j] 



+ exp ( — (tstartx; + tswell * x)) 
(i = 1, 2, or 3) 

1 

1 + exp ( — (dstart + dswell * x)) 



* 0.023 (6) 



* 0.023 (7) 



where the logistic function scales the slope and tstart^ (i = 1, 2, 
or 3), tswell, dstart, and dswell are mathematical parameters mod- 
ulating the rate of exponential growth in slope as a function of 
stochastic variability, estimated in the current study either from 
the model-fitting procedure or from actual EEG data. By control- 
ling the rate of target and distractor activation these parameters 
served as the primary basis of differences among the excitatory 
and inhibitory models explored in the current study. The multi- 
plicative constant of 0.023 replaced two additional free param- 
eters in Melara and Algom, thus simplifying the basic model. 
x is the value of a random variable denoting noise on trial 
Changes in the value of x from one trial to the next were the 
basis of trial-to-trial variability in RTs, yielding a predicted RT on 
each individual trial. In simulations using random noise, values 
of x were generated from a Gaussian or a rectangular distribu- 
tion. In simulations using RP-noise, values of x were obtained 
from EEG data as the average voltage 400-600 ms after stimulus 
onset (Figure 4), individually for each artifact- and response-free 
distractor trial. 

We performed model fitting in each simulation using first 
a parameter-estimation stage and then a backfit solution stage. 
The mathematical goal of the parameter-estimation stage was to 
minimize the following objective function: 



objFunc= 

h= 1 



Evidence (T;(h), t) 2 ] 



(8) 



where Evidence (Tj,t) is defined as in Equation 5, yet substituting 
the observed RTs for f. Tj is the target on the hth trial, denoted 
as Tj(h), = 1.0, f is the observed response time on the hth trial, 
and N is the number of trials in a condition. Parameter estimates 
were defined as those values of tstartj i (i = 1, 2, or 3), tswell, 
dstart, and dswell that minimized Equation 8 for a given par- 
ticipant within a given condition. The minimization procedure 
for parameter estimation used the Downhill Simplex method of 
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Nelder and Mead (1965; see also Press et al, 1992; Vetterling et al., 
1992; Gershenfeld, 1999). In a subsequent backfit solution stage, 
the parameter estimates derived in the earlier stage were inserted 
into Equations 6 and 7 with t as an unknown, so that the simulta- 
neous system of equations could then be solved trial-by-trial for 
t in Equation 8. The value of t on trial h that minimized objFunc 
served as the predicted RT to target T; on that trial. 

MODEL EVALUATIONS 

We used two techniques to evaluate model fit: group RT dis- 
tributions and distributional moments. First, to evaluate each 
simulation at the condition level, we compared the averaged 
observed vs. predicted RT distributions. The technique resem- 
bles that devised by Vincent (1912) for viewing learning curves 
and so is often referred to as Vincent's procedure or Vincentizing 
(Ratcliff, 1979). In the current analysis, each participant's RTs 
were first sorted in ascending order and then divided into 20 
quantiles, such that the first 5% of RTs comprised the first quan- 
tile, the second 5% the second quantile, and so on. RTs were then 
calculated by separately averaging RTs in each of 20 individual 
quantiles for each participant in each condition. A group distri- 
bution for each condition was obtained by averaging RTs across 
participants in each quantile. An analogous group distribution 
was derived from the predicted RTs in each simulation. To high- 
light differences among simulations, we depicted the group RT 
probability density functions as smooth line graphs with each 
quantile probability calculated as: 



P(i) = 



50 



RT(i + 1) — RT(i) 



(i= 1,2, ...n- 1) (9) 



where P(i) is the probability of the zth group quantile, RT(i) 
is the quantile RT for the ith group quantile (in ms), and n is 
the number of quantiles. To evaluate the goodness-of-fit of each 
simulation statistically, we compared the observed vs. predicted 
group distributions using chi-square analysis. 

Second, to assess how closely predicted and observed RTs 
corresponded at the level of individual trials, we examined 
the first four moments (mean, variance, skewness and kurto- 
sis) of the individual participant (unaveraged) RT distributions. 
Skewness and kurtosis were calculated using the following equa- 
tions (Cramer, 1946): 



skewness ; 



kurtosis : 



_ Ef =1 (RTi-mean) 3 

n * 8 3 
EtitRTi-mean) 4 
n* S 4 



where n is the number of trials in the condition, RT; is the reaction 
time for the z'th trial, mean is the average RT across the n trials, and 
S is square root of the RT variance in the condition. For each sim- 
ulation, we conducted a repeated-measures analysis of variance 
(ANOVA) on moment values, with Moment (4 levels; mean, vari- 
ance, skewness and kurtosis), Simulation (2 levels: observed vs. 
predicted), and Condition (4 levels: baseline, filtering 1, 2, and 3) 
as within-subject factors. Statistically significant main effects or 
interactions involving Simulation indicate poor model fit. The 



analysis of distributional moments provides finer granularity and 
greater rigor in model evaluation than is possible when examining 
sets of correlation coefficients between observed and predicted 
RTs, where a separate coefficient (from ~475 data points) is 
needed for each participant in each condition. The microanalysis 
of moments complemented the condition-level analysis of group 
distributions by isolating in each model the four central sources 
of trial-to-trial change in behavioral performance (Sternberg, 
1969a,b; Ratcliff, 1979). 

RESULTS 

EXCITATORY ACTIVATION vs. INHIBITORY ACTIVATION IN THE FACE OF 
DISTRACTION 

The aim of the first pair of computer simulations was to com- 
pare fits of an excitation-only model to fits of an inhibition-only 
model as a means of assessing whether participants primarily 
coped with the increasing distractor salience across conditions by 
varying excitation to the targets or by modulating inhibition of 
the distractors. Each model assumes that both short- and long- 
term memory processes determine rates of activation to target 
or distractor representations. The current paper evaluates the 
role of long-term memory in selective attention failure. Roughly, 
the start parameters (tstart in Equation 6, dstart in Equation 7) 
reflect the influence of long-term memory on target or distrac- 
tor activations, respectively, whereas the swell parameters (tswell 
and dswell) describe the interaction of short- and long-term 
memory, with x reflecting momentary fluctuations in short-term 
memory. The excitation-only model emphasizes the relative influ- 
ence on performance of target activations in long-term memory, 
whereas the inhibition-only model emphasizes executive control 
of distractor activations in long-term memory. 

We began each simulation by estimating a set of six free 
parameters in the baseline (i.e., single distractor) condition: four 
parameters (i.e., tstartji, tstartx2, tstarfrn, and tswell in Equation 
6) controlled the excitation of the targets and two parameters (i.e., 
dstart and dswell in Equation 7) the inhibition of the distractors. 
In the inhibition-only version, the distractor parameter dstart 
was estimated separately in each of the four conditions, whereas 
the estimates made at baseline to the remaining five parameters 
(i.e., dswell, tstartn, tstartx2, tstartrj, and tswell) were applied 
to each of the three filtering (i.e., multiple distractor) conditions. 
The inhibition-only model thus required 9 free parameters. In the 
excitation-only version, the tstart parameter to each target (Ti, 
T2, and T3) was estimated separately in each condition, whereas 
baseline estimates of the remaining three parameters (i.e., dstart, 
dswell, and tswell) were used in each of the three filtering con- 
ditions. The excitation-only model required 15 free parameters. 
In both models both inhibitory and excitatory processes were 
activated on each trial. 

Figure 5 depicts density functions of the observed and pre- 
dicted RTs of the inhibition-only and excitation-only models, 
with each panel illustrating a different condition. Chi-square 
tests were performed between observed and predicted RTs 
in each condition to evaluate the goodness-of-fit of each 
model. The two models are identical in the baseline con- 
dition, so produced equally good fits in Figure 5 A Ixfn) = 
4.25]. However, fits of the two models diverged in the three 
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FIGURE 5 | Probability density functions of observed RTs together with predicted RTs for the Inhibition-Only Model (Gaussian noise) and the 
Excitation-Only Model (Gaussian noise), each panel representing one of four conditions (A = Baseline, B = Filtering 1 , C = Filtering 2, and D = Filtering 3). 



Table 2 | The first four distributional moments (mean, variance, 
skewness, and kurtosis) of the observed RT in each of four conditions 
(Baseline, Filtering 1, Filtering 2, and Filtering 3). 



Condition 


Mean 


Variance 


Skewness 


Kurtosis 


Baseline 


696.761 


141.345 


0.833 


0.734 


Filtering 1 


720.994 


153.733 


0.704 


0.283 


Filtering 2 


738.285 


157.574 


0.628 


0.112 


Filtering 3 


748.382 


155.825 


0.612 


0.040 



filtering conditions (Figures 5B-D), with the analysis of pre- 
dicted RTs in the excitation-only model revealing relatively poor 
fit in each condition [Filtering 1, ^ 2 ' 17 ' = 21.35; Filtering 2, 
Xj" 17 ) = 26.79; Filtering 3, X(u) = 23.44]. By contrast, predicted 
RTs of the inhibition-only model, notwithstanding its 6 fewer 
parameters, more closely corresponded to the observed RTs 
[Filtering 1, X.^17) = 10.06; Filtering 2, X( 17 ) = 9.13; Filtering 3, 
X? 17) = 5.41). 

Further insight is gained from an analysis of distribu- 
tional moments. Table 2 summarizes the four distributional 



Table 3 | Difference between predicted and observed moments in RTs 
(mean, variance, skewness, and kurtosis) for the Inhibit ion-Only 
Model and the Excitation-Only Model. 

Condition Mean Variance Skewness Kurtosis 



INHIBITION-ONLY MODEL (GAUSSIAN NOISE) 



Baseline 


-0.164 


-0.543 


0.012 


0.282 


Filtering 1 


-1.884 


-16.056 


0.176 


-0.253 


Filtering 2 


-1.717 


-12.208 


0.102 


0.688 


Filtering 3 


0.365 


-2.351 


0.017 


0.088 


EXCITATION-ONLY MODEL (GAUSSIAN NOISE) 


Baseline 


-0.164 


-0.543 


0.012 


0.282 


Filtering 1 


-19.732 


-25.176 


-0.111 


-0.137 


Filtering 2 


-25.608 


-22.109 


0.170 


0.877 


Filtering 3 


-24.014 


-26.100 


0.073 


0.491 



parameters (mean, variance, skewness and kurtosis) of the 
observed RTs in each condition. Table 3 tallies the differences 
between observed and predicted moments for each model. As 
one can see, the excitation-only model underestimated in each 
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filtering condition both the mean RT and the RT variance 
(mean RT underestimated by 20 ms, 26 ms, and 24 ms in the 
Filtering 1, Filtering 2, and Filtering 3 conditions; RT variance 
by 25 ms, 22 ms, and 26 ms). RT predictions of the inhibition- 
only model led to slightly compressed estimates of variance, 
but only in the Filtering 1 and 2 conditions; otherwise the 
model more closely mimicked all four distributional moments. 
Overall, despite its 6 fewer parameters, the inhibition-only model 
edged the excitation-only model in capturing distributional char- 
acteristics of each participant in each condition. Nevertheless, 
ANOVA indicated that neither model yielded exceptional fits 
to the observed data, with both simulations evincing statis- 
tically significant differences between observed and predicted 
moment values [Excitation-Only: Simulation, F(i t io) = 36.1, 
p < 0.001; Simulation x Condition, F (3j 30) = 15.07, p < 0.001; 
Simulation x Moment, F( 3 30 ) = 30.57, p < 0.001; Simulation x 
Condition x Moment, F(9 ; 90) = 12.21, p < 0.001; Inhibition- 
Only: Simulation, F(i, 10) = 9.34, p < 0.01; Simulation x 
Condition, F( 3i 30 ) = 6.5, p < 0.001; Simulation x Moment, 
F( 3 30 ) = 11.36, p < 0.001; Simulation x Condition x Moment, 
F (9 , 90) = 7.12,p < 0.001]. 

REJECTION P0SITIVITY vs. PROCESSING NEGATIVITY 

The aim of the second pair of simulations was to evaluate the link- 
ing hypothesis that RP is a biomarker of inhibitory control during 
tasks of selective attention, ultimately affecting the speed to decide 
target identity. To achieve this goal, we estimated sloped in each 
condition using actual RP amplitudes. Parameter estimation for 
the RP model began with a linear regression in which the esti- 
mates of dstart from the four conditions of the inhibition-only 
simulation were used to predict slow- wave amplitude of each par- 
ticipant's averaged ERP waves at Fz, 400-600 ms after distractor 
onset (see Melara et al, 2005). By estimating only the regression 
coefficient and the Y intercept we were able to reduce the total 
number of free parameters of the inhibition-only model from 9 
to 7. For example, a participant's dstart in the Filtering 1 condi- 
tion equaled the Y intercept added to the product of the regression 
coefficient and the average RP amplitude in that condition. After 
obtaining dstart values, the tstart, tswell, and dswell parameters 
of Equations 6 and 7 were re-estimated in the baseline condi- 
tion using Downhill Simplex to obtain the best baseline fit. As 
comparison, we simulated a PN model in which slopexi, slopei2, 
and slopex 3 in each condition were estimated using magnitudes 
of target PN, that is, each participant's slow-wave ERP amplitude 
to the target at Fz, 400-600 ms after target onset. We used a lin- 
ear regression method analogous to that used in the RP model to 
estimate the tstart parameters in each condition, which reduced 
the total number of free parameters of the excitation-only model 
from 15 to 9. This approach allowed us to evaluate a neurobe- 
havioral model of inhibition-only (RP model) directly against a 
neurobehavioral model of excitation-only (PN model). 

Figure 6 contains the Vincentized RT density distributions 
from the predictions of the RP and PN models. Results 
with these neurobehavioral models mirror those obtained 
with their parameterized counterparts: Fits of the RP model 
with observed data were relatively good [Baseline, x.n 7 ) = 
3.91; Filtering 1, Xn 7 ) = 1L8S ; Filtering 2, xfn) = U.96; 



Filtering 3, X( 17 j = 6.65], whereas those of the PN model aligned 
with observed RTs only in the baseline condition [Baseline, 
X( 17) = 5.12; Filtering 1, X( i7 ) = 34.16, p < 0.01; Filtering 2, 
X| 17) = 31.52, p < 0.05; Filtering 3, x|i 7J = 41.91, p < 0.01]. 
Inspection of distributional moments (Table 4) indicated, once 
again, that the excitation-based (PN) model was poor during 
filtering at reproducing measures of central tendency and vari- 
ability (except in Filtering 3; mean RT underestimated by 31 ms, 
31 ms, and 26 ms in the Filtering 1, Filtering 2, and Filtering 3 
conditions; variance in RT by 34 ms, 35 ms, and 1 ms). The RP 
model, on the other hand, accurately reproduced all moments, 
with the exception of variance in the Filtering 1 (—19 ms) and 
Filtering 2 (—14 ms) conditions, reminiscent of the inhibition- 
only model (cf. Table 3). As we shall see, underestimation of 
RT variance in the two inhibition models may owe to proper- 
ties of the underlying (Gaussian) noise distribution. In any case, 
it is worth noting that the RP model, with only 7 free param- 
eters, was at least as good in capturing the RT distributions as 
the inhibitory-only model with 9 free parameters, and better 
than the PN model, also with 9 free parameters. This outcome 
illustrates that actual RP voltage is viable in predicting specific 
properties of RTs in auditory selective attention tasks, suggest- 
ing that RP activity serves as a brain biomarker of processes of 
distractor inhibition. Still, ANOVA uncovered statistically signif- 
icant differences between observed and predicted moment values 
in both the PN and RP models [PN:F {h W ) = 6.12, p < 0.05; RP: 
F ( i, 10) = 25.31, p < 0.001]. 

RANDOM NOISE vs. EEG NOISE 

Each of the foregoing simulations employed Gaussian noise — 
computer-generated random numbers from a standard normal 
distribution — to induce change in RT from one trial to the next. 
The third set of simulations aimed to evaluate the linking hypoth- 
esis that momentary distractor-evoked fluctuations in RP voltage 
are meaningfully associated with stochastic behavioral processes. 
We evaluated this hypothesis by comparing the initial inhibition- 
only model using Gaussian-noise with a comparable model that 
predicted behavioral variation using RP-noise. An additional sim- 
ulation implemented a third variant of the inhibition-only model, 
one that generated RT variability from Rectangular-noise. An RP 
distribution was created for each participant in each condition 
using a single-trial EEG analysis (see Parra et al., 2002; Philiastides 
and Sajda, 2006) in which the amplitude in microvolts of RP at 
Fz 400-600 ms after stimulus onset was obtained to each distrac- 
tor trial in which the participant correctly withheld a key press 
and rescaled by multiplying a constant. In extracting single-trial 
EEG signals to solitary distractors we eschew any inherent physio- 
logical information about response processing, such as the motor 
activity to targets. 

Each model contained 9 free parameters. In each of the three 
simulations, RTs were sorted from shortest to longest for each 
participant in each condition, without regard to the particular 
target (Ti, T2, or T 3 ) presented on a trial. A set of n values from 
each of the three distributions (Gaussian, RP, and Rectangular) 
was selected and sorted from smallest to largest. The backfit 
solution procedure was then applied such that, for each trial, 
one of the n distributional values served as the value of x in 
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FIGURE 6 | Probability density functions of observed RTs together with predicted RTs for the RP Model (Gaussian noise) and the PN Model 
(Gaussian noise), each panel representing one of four conditions (A = Baseline, B = Filtering 1, C = Filtering 2, and D = Filtering 3). 



Table 4 | Difference between predicted and observed moments in RTs 
(mean, variance, skewness, and kurtosis) for the RP Model and the 
PN Model. 



Condition 


Mean 


Variance 


Skewness 


Kurtosis 




RP MODEL (GAUSSIAN NOISE) 


Baseline 


-0.164 


-0.543 


-0.012 


0.282 


Filtering 1 


-1.884 


-16.056 


-0.176 


-0.253 


Filtering 2 


-1.536 


-14.241 


0.121 


0.847 


Filtering 3 


-5.960 


-7.300 


-0.003 


0.214 


PN MODEL (GAUSSIAN NOISE) 


Baseline 


0.001 


-2.038 


-0.109 


0.051 


Filtering 1 


-31.380 


-33.911 


-0.089 


0.302 


Filtering 2 


-31.083 


-35.336 


0.149 


0.788 


Filtering 3 


-26.222 


1.102 


0.896 


7.116 



Equations 6 and 7, with the smallest random number assigned 
as x to the trial having the shortest RT, and so on through the 
set. Density functions of predicted RTs from each model appear 
with observed RTs in the panels of Figure 7. The commonly 
used Gaussian-noise model, as already noted, underestimated 



RT variance in two of the three filtering tasks. One can see in 
Figure 7 that this model was particularly lackluster in simulat- 
ing RTs in the 600-800 ms range. The Rectangular-noise model, 
however, proved even worse in fitting the observed data in each of 
the four conditions [Baseline, X.(i7) = 43.11, p < 0.01; Filtering 
1, X( 17) = 45.92, p < 0.01; Filtering 2, x| 17) = 37.66, p < 0.01; 

22.13, nil. Predicted RT distributions cre- 



(17) 



Filtering 3, x 

ated from Rectangular-noise were too platykurtic and skewed 
excessively toward the positive tail (see Table 5). Interestingly, we 
found that a model using fluctuations in EEG as the source of 
trial-to-trial variation in RT yielded the best fit. Simulated RTs 
from the RP-noise model quite accurately mirrored observed RT 
variance (together with the other three moments) across condi- 
tions. In fact, ANOVA revealed no significant differences between 
observed and predicted moment values in the RP-noise model 
[F(i t io) = 1.47, ns], and no interactions of Simulation with 
Moment [F( 3 30 ) = 2.29, ns], Condition [_F( 3 30 ) = 0.59, ns], or 
both [F(9. 90) = 0.95, ms]. This finding suggests that momentary 
RP activity to distractors is closely associated with trial-to-trial 
variability in speed of responding to targets. 

Why did RP-noise provide a better fit than Gaussian-noise 
(or Rectangular-noise)? We examined standardized probability 
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FIGURE 7 | Probability density functions of observed RTs together 
with predicted RTs for the Inhibition-Only Model using three 
methods for generating inter-trial variability: Gaussian noise, RP 



distributions of RP-noise, Gaussian-noise, and observed RTs in 
each condition. The observed data more consistently resem- 
bled the RP-noise distributions in both variance and skewness 
across conditions, with distributional co-variation slightly higher 
between RT and RP-noise probabilities (accounting for 96% of 
the variance) than between RT and Gaussian probabilities (94% 
of variance). This slender advantage was enough to give the RP- 
noise simulations the edge in predicting individual RTs on a 
trial-by-trial basis. 

FULLY LINKED SIMULATION 

We found in the previous set of simulations that a model imple- 
menting RP-noise provided a closer match to observed RTs than 
one implementing Gaussian-noise. Earlier simulations showed 
that models that modulate distractor inhibition — in the form of 
either the inhibition-only model or the RP model — offered better 
fits than models that varied target excitation. In the final pair of 
simulations we incorporated RP-noise into the two models that 
modulate distractor inhibition, namely, (1) an inhibition-only 



noise, and Rectangular Noise. Each panel represents one of four 
conditions (A = Baseline, B = Filtering 1, C = Filtering 2, and 
D = Filtering 3). 



version that estimates dstart using only free parameters and (2) 
an RP version that estimates dstart using each participant's signal- 
averaged RP component. The aim of these simulations was to 
assess the viability of a fully neurobehavioral model: the RP model 
with RP-noise. We expected that this model, with 7 parameters, 
would fit the data as well as the inhibition-only model with RP 
noise, with 9 parameters, our best model to date. This comparison 
thus provides an especially strong test of the linking hypothesis 
under study. 

Figure 8 reveals that both models provided excellent fits to 
the observed data. Analysis of distributional moments, depicted 
in Table 6, confirms that the models captured key distributional 
properties of the data, particularly with regard to variability 
and skewness, with neither model showing significant differences 
between observed and predicted moment values [Inhibition-Only: 
_F(! 10 ) = 1.47, ms; -RP: F(i, io) = 3.51, ns\. A common feature 
of the two models was the slight overestimation of peakedness 
during performance of the filtering tasks, creating predicted dis- 
tributions more leptokurtic than the observed distributions (see 
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Table 5 | Difference between predicted and observed moments in RTs 
(mean, variance, skewness, and kurtosis) for theGaussian-noise, 
RP-noise, and Rectangular-noise simulations of the Inhibition-Only 
Model. 

Condition Mean Variance Skewness Kurtosis 





Baseline 


-0.164 


-0.543 


-0.012 


0.282 


Filtering 1 


-1.884 


-16.056 


-0.176 


-0.253 


Filtering 2 


-1.717 


-12.208 


0.102 


0.688 


Filtering 3 


0.365 


-2.351 


-0.017 


0.088 


RP NOISE 


Baseline 


-0.223 


-0.848 


-0.038 


-0.007 


Filtering 1 


0.157 


-3.274 


0.111 


0.592 


Filtering 2 


-0.525 


-7.011 


0.114 


0.574 


Filtering 3 


-0.192 


-5.820 


0.088 


0.484 


RECTANGULAR NOISE 


Baseline 


1.415 


-5.574 


-0.516 


-1.809 


Filtering 1 


1.378 


-9.102 


-0.415 


-1.383 


Filtering 2 


0.688 


-10.003 


-0.305 


-1.178 


Filtering 3 


1.912 


-2.711 


-0.245 


-1.088 



Table 6). Nevertheless, the correlation between observed and pre- 
dicted RTs exceeded 0.98 for both models. Thus, with two fewer 
free parameters, the RP model with RP noise was as good as 
the inhibition-only model with RP noise at characterizing the 
observed RT distributions. 

DISCUSSION 

Four sets of computer simulations using variants of a formal 
model of selective attention (Melara and Algom, 2003) probed 
the role of RP in the inhibitory control of distraction. The sim- 
ulations used behavioral and electrophysiological data recorded 
in an earlier study where the degree of distractor salience var- 
ied as participants performed an auditory selective attention task 
involving asynchronous presentation of targets and distractors 
(Melara et al, 2005). Each simulation sought to predict RTs 
to targets in the face of different degrees of distraction. The 
first pair of simulations compared predictions of an inhibition- 
only vs. excitation-only version of the model, in each case using 
free parameters to set levels across conditions of inhibitory or 
excitatory activity, respectively. We found that the inhibition- 
only model alone accounted satisfactorily for detailed features 
of the observed RT distributions as distractor salience changed. 
A second pair of simulations implemented two neurobehavioral 
versions of the model: an RP model, which incorporated each par- 
ticipant's averaged slow- wave activity to distractors in estimating 
slopes of distractor inhibition, vs. a PN model, which incorpo- 
rated averaged slow- wave activity to targets in estimating slopes of 
target excitation. The RP model, but not the PN model, captured 
changes in RT performance closely as a function of distractor 
salience. 

The third set of simulations explored how well different 
sources of stochastic variation predicted individual RTs. One 
source was the momentary fluctuations in slow-wave RP activ- 
ity (RP-noise) measured using single-trial EEG analysis. We 



compared RP-noise with Gaussian-noise and Rectangular-noise, 
with values drawn randomly from computer-generated nor- 
mal and rectangular distributions, respectively. We found that 
RP-noise mimicked inter-trial RT variation more closely than 
Gaussian-noise and, especially, Rectangular-noise. Finally, in a 
fully linked simulation, we showed that a version of the RP model 
that included RP-noise provided an exceptionally good fit to the 
behavioral data, with the fewest free parameters. Our results are 
consistent with the hypothesis that RP activity influences the 
speed of responding to targets during selective attention through 
the inhibitory modulation of distractors. 

COMPUTATIONAL LINKS BETWEEN PHYSIOLOGY AND BEHAVIOR 

Melara et al. (2005) found that behavioral performance to tar- 
gets changed systematically as a function of distractor salience, 
and was accompanied by a progressive change in RP amplitude 
to distractors, but not in PN amplitude to targets. The authors 
concluded that RP serves to suppress intrusions from previ- 
ously presented distractors as participants make decisions about 
currently perceived targets. However, average RP amplitude was 
merely correlated in their study with levels of distractor salience. 
Thus, it is conceivable that RP functions simply to monitor stim- 
ulus salience, playing no causal role in distractor inhibition. The 
primary contribution of the current study is in showing that the 
magnitude of RP to distractors closely predicts an individual's RT 
to targets through modulation of inhibitory control. We there- 
fore were able to link RP explicitly to behavior in the context of 
a formal computational model of distractor processing. Equally 
important, we found that PN to targets was relatively ineffective 
in predicting RTs through the modulation of target activity. Thus, 
here at least, distractor processing (RP) was more diagnostic of 
target behavior during selective attention than target processing 
itself (PN). Later we discuss possible reasons for this result. 

The current study focused on a single behavioral measure: RT. 
A host of recent electrophysiological studies have pointed to the 
connection between response latency and the amplitude of var- 
ious ERP components, including Nl (e.g., Melara et al., 2005), 
P2 (e.g., Sheehan et al, 2005; Tong et al, 2009), MMN (e.g., 
Atienza et al, 2002), and P3b (e.g., Alain et al., 2007). Ours is 
among relatively few studies (e.g., Holroyd and Coles, 2002) able 
to demonstrate the link between EEG amplitude and microscopic 
changes in RT. Still, other EEG measures, including Nd onset 
latency and power in EEG bands, and other behavioral measures, 
including target misses and distractor false alarms, are sensitive 
to attention manipulations, such as distractor salience. Indeed, 
in the study that served as the empirical basis of the current 
simulations, Melara et al. (2005) identified several connections 
between physiology and behavior. The goal of the current study 
was to concentrate on a single linking hypothesis. Future stud- 
ies should aim to relate internal processes to other response 
measures, including target accuracy, false alarms, misses, and 
incorrect identifications. 

REPRESENTING STOCHASTIC BEHAVIORAL PROCESSES USING 
NEUROPHYSIOLOGY ACTIVITY 

A further contribution of the current study was in demonstrat- 
ing the feasibility of incorporating momentary fluctuations in 
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FIGURE 8 | Probability density functions of observed RTs together with predicted RTs for the Inhibition-Only Model (RP noise) and the RP Model (RP 
noise), each panel representing one of four conditions (A = Baseline, B = Filtering 1, C = Filtering 2, and D = Filtering 3). 



neurophysiological activity into a formal model to predict trial- 
level performance during selective attention. Our simulations 
using RP-noise revealed that electroencephalographic measure- 
ments excelled over random Gaussian or rectangular activity in 
capturing trial-to-trial variations in RT to targets. Indeed, a fully 
neurobehavioral version of the model (RP model with RP-noise) 
was able to account successfully for the RTs of every participant 
on every trial in every condition of the study. 

To be sure, the fidelity of RT prediction from random 
noise was perforce determined by statistical characteristics of 
the computer-generated distributions we implemented. Had 
the stochastic processes been modeled using a probability dis- 
tribution with a slightly different skewness and kurtosis the 
results perhaps would imitate those with RP-noise. Our aim 
was not to abandon the use of random noise distributions 
in modeling stochastic behavioral processes. Nevertheless, the 
success of RP-noise in predicting RT variability in the cur- 
rent study serves, at the very least, as proof of concept 



that nuanced behavioral performance during selective atten- 
tion can be linked closely to internal neurophysiological 
events. 

Moreover, our results hold potential theoretical significance 
in suggesting that RP-noise involves activation of distractor rep- 
resentations that, ultimately, carry behavioral consequences. On 
this view, variability in responding to a currently presented target 
is due in part to the moment-to-moment changes in activation 
of recently presented distractors, presumably held in working 
memory. Fluctuations in distractor activation contribute to the 
uncertainty inherent on target trials, captured in the model 
as trial-to-trial shifts in the rates of activation to both targets 
(slopeji, slopex2, slopex3; see Equation 6) and distractors (slopej; 
see Equation 7). In this context, one might characterize RP-noise 
as reflecting the momentary success or failure of endogenous 
attentional processes (such as inhibitory control) in suppressing 
distractor representations while maintaining representations of 
target relevance. 
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Table 6 | Difference between predicted and observed moments in RTs 
(mean, variance, skewness, and kurtosis) for the Inhibition-Only 
Model with RP Noise and the RP Model with RP Noise. 

Condition Mean Variance Skewness Kurtosis 



INHIBITION-ONLY MODEL (RP NOISE) 



Baseline 


-0.223 


-0.848 


-0.038 


-0.007 


Filtering 1 


0.157 


-3.274 


0.111 


0.592 


Filtering 2 


-0.525 


-7.011 


0.114 


0.574 


Filtering 3 


-0.192 


-5.820 


0.088 


0.484 


RP MODEL (RP NOISE) 


Baseline 


7.253 


1.018 


-0.055 


-0.002 


Filtering 1 


-1.074 


-4.641 


0.115 


0.707 


Filtering 2 


-5.741 


-10.161 


0.111 


0.657 


Filtering 3 


-2.928 


-9.387 


0.079 


0.544 



COMPARISON WITH OTHER ATTENTION THEORIES: BIASED 
COMPETITION AND ATTENTI0NAL TRACE THEORY 

The biased competition model (Desimone and Duncan, 1995; 
Reynolds et al., 1999) has been the theory of choice in explaining 
evidence of attentional modulation using single- and multi-unit 
recordings of neurons in primary and secondary areas of visual 
cortex (see Reynolds and Chelazzi, 2004, for a review). A canon- 
ical paradigm for these studies is visual search, in which animals 
peruse a field of images in search of a target object (e.g., Moore 
and Fallah, 2001, 2004; Bichot et al, 2005; Hayden and Gallant, 
2005). The model predicts that representations of the target object 
in working memory bias the competition among items within 
the receptive field to favor cortical cells responsive to the spatial 
location of the target, perhaps through gain control. The closer 
the match between the target representation and an item within 
the receptive field (or, conversely, the stronger the mismatch with 
other competing items) the greater the competitive edge the item 
enjoys in activating spatially corresponding cells in visual cortex. 

Simulations in the current study relied on a different model 
of attention: the tectonic theory of Melara and Algom (2003). 
Tectonic theory also emphasizes competition in working mem- 
ory between target and distractor representations as the basis 
of attentional control. However, in contrast to the biased com- 
petition model, the theory discounts target-distractor similarity 
as the primary source of competition, and space-based modula- 
tion as the primary mechanism of attentional control (see also 
Treue and Martmez-Trujillo, 1999). Instead, in tectonic theory 
the chief threats to inhibitory control are distractor salience and 
stimulus uncertainty. On this view, representational variability 
in targets or distractors, rather than representational compari- 
son processes pitting targets against distractors, best characterizes 
the dynamic relationship between attention and working mem- 
ory. In fact, Melara et al. (2005) used their empirical results 
to claim that representational variability (tectonic) dominates 
target-distractor similarity (biased competition) in undermining 
selective attention. The current results go further in demonstrat- 
ing that control mechanisms lawfully govern distractor variability 
in working memory at both the individual-trial level and the 
condition level. 



Why is biased competition a less satisfactory account here 
compared with its other applications? One possibility consid- 
ers the specific processing demands of the attention paradigm 
employed. Kahneman and Treisman (1984) divided the selec- 
tion paradigms most commonly used in attention research into 
two categories: filtering and selective set. In filtering paradigms, 
including the dual-channel task used by Melara et al. (2005), 
participants are asked to focus on signals from a task-relevant 
channel (e.g., auditory frequencies of a certain timbre) while dis- 
regarding signals from a task-irrelevant channel (e.g., frequencies 
of a different timbre). In selective-set paradigms, including visual 
search (e.g., Chelazzi et al., 2001; Bichot et al., 2005), participants 
anticipate or must locate one stimulus among several possible 
other stimuli. The two paradigms impose different information 
processing burdens on working memory. During selective set, 
working memory acts as a convenient hub for testing the degree 
of alignment between an expected or sought-after stimulus and 
one currently within gaze or awareness. Here, a premium is placed 
on mechanisms of comparison, so distractors that resemble those 
expected or searched should impair processing efficiency. By con- 
trast, during filtering, working memory acts as a sorting center to 
separate signal (task relevant) from noise (task irrelevant). Here, a 
premium is placed on mechanisms of suppression, so distractors 
high in noise (variability) should impair processing efficiency. On 
this view, models of target-distractor similarity, such as biased 
competition, would emerge most naturally from results obtained 
with selective-set paradigms, whereas models of representational 
variability, such as tectonic theory, would emerge from results 
obtained with filtering paradigms. 

One notable counterexample is Naatanen's (1982, 1992) atten- 
tional trace theory, which holds that selective attention perfor- 
mance in filtering paradigms is determined by how closely stimuli 
in a task- irrelevant channel match the trace in working memory 
representing stimuli in the task-relevant channel. The closer the 
match, the larger is the amplitude of PN, indicating greater fil- 
tering task difficulty. In this way, attentional trace theory uses 
target-distractor similarity during filtering to explain competition 
in working memory between task-relevant and task-irrelevant 
signals. Here one must distinguish, however, between two types of 
target-distractor similarity: within channel and between channel 
(see Melara, 1992). Attentional trace theory focuses on between- 
channel similarity: PN is greater the smaller the physical separa- 
tion between signals defined as task relevant vs. task irrelevant 
(e.g., greater PN for 50 Hz separation between channels than 
400 Hz separation; see Hansen and Hillyard, 1983). Melara et al. 
(2005), on the other hand, manipulated within-channel simi- 
larity: Targets defined by one dimension (e.g., timbre) could be 
more or less similar to distractors along the dimension requiring 
identification (e.g., pitch). 

Using this distinction, one might reason that mechanisms 
of comparison operate in attention systems whenever similar- 
ity between target and distractor channels must be resolved — 
common in selective-set paradigms — whereas mechanisms of 
suppression operate in attention systems whenever representa- 
tional variability within distractor channels must be controlled — 
common in filtering paradigms. In line with this reasoning, we 
found in this study of within-channel filtering that RP, indicative 
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of distractor suppression, was highly predictive of individual RTs 
to targets whereas PN, indicative of target-distractor comparison, 
was not. We might expect the fit of PN neurobehavioral mod- 
els to improve when simulating paradigms where the similarity 
between target and distractor channels is varied systematically, 
whether employing selective-set or filtering paradigms. 

CONCLUSIONS 

Computational analysis revealed that modulation of distractor 
inhibition using RP voltage accurately predicts the speed of target 
responding as a function of distractor salience. Our analysis fur- 
ther revealed that oscillations in RP account well for trial-to-trial 
variations in target performance. In using actual neural activity 
to computationally model inhibitory control between trials and 
between tasks, the current study establishes a solid link between 
RP magnitude and RT performance during selective attention. 
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